Trilogy on Computing Maximal Eigenpair
The eigenpair here means the pair consisting of an eigenvalue and its eigenvector.
This paper introduces the three steps of our study on computing the maximal
eigenpair. In the first two steps, we construct efficient initials for a known
but dangerous algorithm, first for tridiagonal matrices and then for
irreducible matrices having nonnegative off-diagonal elements. In the third
step, we present two global algorithms which are still efficient and work well
for a quite large class of matrices, for instance even complex ones.
Comment: Updated version
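For orientation, the baseline computation can be sketched with plain power iteration. This is only the textbook method for a maximal eigenpair, not the paper's algorithm; the efficient initials and the two global algorithms are the paper's contribution and are not reproduced here, and the uniform initial vector below is merely a placeholder assumption.

```python
import numpy as np

def max_eigenpair(A, v0=None, tol=1e-10, max_iter=10_000):
    """Approximate the maximal eigenpair of a symmetric matrix A by
    power iteration with a Rayleigh-quotient eigenvalue estimate.
    The choice of initial vector v0 is exactly what a good method
    optimizes; here we fall back to a uniform start."""
    n = A.shape[0]
    v = np.ones(n) / np.sqrt(n) if v0 is None else v0 / np.linalg.norm(v0)
    lam = v @ A @ v
    for _ in range(max_iter):
        w = A @ v
        v_new = w / np.linalg.norm(w)
        lam_new = v_new @ A @ v_new
        if abs(lam_new - lam) < tol:
            return lam_new, v_new
        lam, v = lam_new, v_new
    return lam, v

# Toy tridiagonal matrix with nonnegative off-diagonal elements.
A = np.diag([2.0, 3.0, 4.0]) + np.diag([1.0, 1.0], 1) + np.diag([1.0, 1.0], -1)
lam, v = max_eigenpair(A)
```

The iteration converges at a rate governed by the gap between the two largest eigenvalues, which is why a well-chosen initial vector matters so much in practice.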
Random Surfing Without Teleportation
In the standard Random Surfer Model, the teleportation matrix is necessary to
ensure that the final PageRank vector is well-defined. The introduction of this
matrix, however, results in serious problems and imposes fundamental
limitations to the quality of the ranking vectors. In this work, building on
the recently proposed NCDawareRank framework, we exploit the decomposition of
the underlying space into blocks, and we derive easy-to-check necessary and
sufficient conditions for random surfing without teleportation.
Comment: 13 pages. Published in the volume "Algorithms, Probability, Networks and Games", Springer-Verlag, 2015. (The updated version corrects small typos/errors.)
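For context, a minimal sketch of the standard Random Surfer Model that the paper departs from. The damping factor `alpha` and the uniform teleportation term are the textbook choices, not anything specific to the NCDawareRank-based conditions derived in the paper.

```python
import numpy as np

def pagerank(P, alpha=0.85, tol=1e-12):
    """Power iteration for the standard Random Surfer Model.

    With probability alpha the surfer follows the row-stochastic matrix P;
    otherwise she teleports to a uniformly random node.  The teleportation
    term is what makes the stationary vector well-defined even when P alone
    is reducible or periodic."""
    n = P.shape[0]
    pi = np.ones(n) / n
    while True:
        pi_new = alpha * pi @ P + (1 - alpha) / n
        if np.abs(pi_new - pi).sum() < tol:
            return pi_new
        pi = pi_new

# Toy 3-node chain: without teleportation this walk is periodic,
# so its PageRank vector would not be well-defined as a limit.
P = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 1.0, 0.0]])
pi = pagerank(P)
```

The paper's point is precisely that the conditions under which this teleportation term can be dropped are checkable on the block structure of the underlying space.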
Semantic distillation: a method for clustering objects by their contextual specificity
Techniques for data-mining, latent semantic analysis, contextual search of
databases, etc. have long ago been developed by computer scientists working on
information retrieval (IR). Experimental scientists, from all disciplines,
having to analyse large collections of raw experimental data (astronomical,
physical, biological, etc.) have developed powerful methods for their
statistical analysis and for clustering, categorising, and classifying objects.
Finally, physicists have developed a theory of quantum measurement, unifying
the logical, algebraic, and probabilistic aspects of queries into a single
formalism. The purpose of this paper is twofold: first to show that when
formulated at an abstract level, problems from IR, from statistical data
analysis, and from physical measurement theories are very similar and hence can
profitably be cross-fertilised; and second, to propose a novel method of
fuzzy hierarchical clustering, termed \textit{semantic distillation} --
strongly inspired by the theory of quantum measurement -- which we developed to
analyse raw data coming from various types of experiments on DNA arrays. We
illustrate the method by analysing DNA array experiments and clustering the
genes of the array according to their specificity.
Comment: Accepted for publication in Studies in Computational Intelligence, Springer-Verlag
The intellectual influence of economic journals: quality versus quantity
The evaluation of scientific output has a key role in the allocation of
research funds and academic positions. Decisions are often based on quality indicators
for academic journals, and over the years, a handful of scoring methods have
been proposed for this purpose. Discussing the most prominent methods (de facto
standards), we show that they do not distinguish quality from quantity at the article level.
The systematic bias we find is analytically tractable and implies that the methods are
manipulable. We introduce modified methods that correct for this bias, and use them
to provide rankings of economic journals. Our methodology is transparent and our results
are replicable.
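As background, the prominent scoring methods discussed are eigenvector-type schemes on the journal citation network. The sketch below is a generic such score (in the spirit of invariant/Eigenfactor-style methods); the toy citation matrix is invented for illustration, and the paper's bias correction is deliberately not reproduced.

```python
import numpy as np

def eigen_scores(C, tol=1e-12):
    """Eigenvector-type journal scores.

    C[i, j] = citations from journal j to journal i.  Each journal's
    outgoing citations are normalized so M is column-stochastic, and the
    score vector is the dominant eigenvector w = M w, found by power
    iteration.  Note this rewards total citations received, which is why
    such scores can conflate article quality with article quantity."""
    M = C / C.sum(axis=0, keepdims=True)
    n = C.shape[0]
    w = np.ones(n) / n
    while True:
        w_new = M @ w
        w_new /= w_new.sum()
        if np.abs(w_new - w).sum() < tol:
            return w_new
        w = w_new

# Hypothetical citation matrix for three journals (no self-citations).
C = np.array([[0.0, 5.0, 1.0],
              [3.0, 0.0, 2.0],
              [1.0, 4.0, 0.0]])
w = eigen_scores(C)
```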
Structuring heterogeneous biological information using fuzzy clustering of k-partite graphs
<p>Abstract</p> <p>Background</p> <p>Extensive and automated data integration in bioinformatics facilitates the construction of large, complex biological networks. However, the challenge lies in the interpretation of these networks. While most research focuses on the unipartite or bipartite case, we address the more general but common situation of <it>k</it>-partite graphs. These graphs contain <it>k </it>different node types and links are only allowed between nodes of different types. In order to reveal their structural organization and describe the contained information in a more coarse-grained fashion, we ask how to detect clusters within each node type.</p> <p>Results</p> <p>Since entities in biological networks regularly have more than one function and hence participate in more than one cluster, we developed a <it>k</it>-partite graph partitioning algorithm that allows for overlapping (fuzzy) clusters. It determines for each node a degree of membership to each cluster. Moreover, the algorithm estimates a weighted <it>k</it>-partite graph that connects the extracted clusters. Our method is fast and efficient, mimicking the multiplicative update rules commonly employed in algorithms for non-negative matrix factorization. It facilitates the decomposition of networks on a chosen scale and therefore allows for analysis and interpretation of structures on various resolution levels. Applying our algorithm to a tripartite disease-gene-protein complex network, we were able to structure this graph on a large scale into clusters that are functionally correlated and biologically meaningful. Locally, smaller clusters enabled reclassification or annotation of the clusters' elements. We exemplified this for the transcription factor MECP2.</p> <p>Conclusions</p> <p>In order to cope with the overwhelming amount of information available from biomedical literature, we need to tackle the challenge of finding structures in large networks with nodes of multiple types. 
To this end, we presented a novel fuzzy <it>k</it>-partite graph partitioning algorithm that allows the decomposition of these objects in a comprehensive fashion. We validated our approach on both artificial and real-world data. It is readily applicable to further problems of this kind.</p>
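The multiplicative updates the abstract refers to can be illustrated in the bipartite special case. This is a hedged sketch of standard Lee-Seung updates for non-negative matrix factorization, which the paper's k-partite algorithm mimics; the actual k-partite algorithm and its estimated cluster-level graph are not reproduced here.

```python
import numpy as np

def fuzzy_memberships(A, k, n_iter=500, seed=0):
    """Lee-Seung multiplicative updates for A ~ W H (Frobenius loss).

    Row-normalizing W gives each row node a degree of membership in each
    of the k clusters; overlapping (fuzzy) assignments arise naturally
    because memberships need not be 0/1."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, k)) + 1e-3
    H = rng.random((k, n)) + 1e-3
    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W / W.sum(axis=1, keepdims=True)

# Toy bipartite adjacency with two obvious blocks of row nodes.
A = np.zeros((6, 6))
A[:3, :3] = 1.0
A[3:, 3:] = 1.0
M = fuzzy_memberships(A, 2)
```

Because the updates are multiplicative and all factors stay non-negative, entries never flip sign, which keeps the membership interpretation valid throughout the iteration.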
Hidden dynamics of soccer leagues: the predictive 'power' of partial standings
Objectives Soccer league tables reflect the partial standings of the teams involved after each round of competition. However, the ability of partial league standings to predict end-of-season position has largely been ignored. Here we analyze historical partial standings from English soccer to understand the mathematics underpinning league performance and evaluate the predictive 'power' of partial standings. Methods Match data (1995-2017) from the four senior English leagues was analyzed, together with random match scores generated for hypothetical leagues of equivalent size. For each season the partial standings were computed and Kendall's normalized tau-distance and Spearman r-values determined. Best-fit power-law and logarithmic functions were applied to the respective tau-distance and Spearman curves, with the goodness-of-fit assessed using the R2 value. The predictive ability of the partial standings was evaluated by computing the transition probabilities between the standings at rounds 10, 20 and 30 and the final end-of-season standings for the 22 seasons. The impact of reordering match fixtures was also evaluated. Results All four English leagues behaved similarly, irrespective of the teams involved, with the tau-distance conforming closely to a power law (R2>0.80) and the Spearman r-value obeying a logarithmic function (R2>0.87). The randomized leagues also conformed to a power law, but had a different shape. In the English leagues, team position relative to end-of-season standing became 'fixed' much earlier in the season than was the case with the randomized leagues. In the Premier League, 76.9% of the variance in the final standings was explained by round 10, 87.0% by round 20, and 93.9% by round 30. Reordering of match fixtures appeared to alter the shape of the tau-distance curves. Conclusions All soccer leagues appear to conform to mathematical laws, which constrain the league standings as the season progresses.
This means that partial standings can be used to predict end-of-season league position with reasonable accuracy.
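The distance measure used in the Methods section can be sketched directly, assuming the usual definition of the normalized Kendall tau-distance as the fraction of discordant pairs; the exact conventions of the paper (e.g. tie handling) are not specified in the abstract.

```python
from itertools import combinations

def normalized_tau_distance(table1, table2):
    """Normalized Kendall tau-distance between two league tables given as
    ordered lists of the same teams: the fraction of team pairs whose
    relative order differs between the two tables (0 = identical order,
    1 = completely reversed)."""
    pos1 = {team: i for i, team in enumerate(table1)}
    pos2 = {team: i for i, team in enumerate(table2)}
    pairs = list(combinations(table1, 2))
    discordant = sum(
        (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0 for a, b in pairs
    )
    return discordant / len(pairs)
```

Comparing the round-r table against the final table with this measure, season by season, yields the tau-distance curves to which the power-law fits were applied.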